224 research outputs found

    Zero-Shot Certified Defense against Adversarial Patches with Vision Transformers

    Adversarial patch attacks aim to fool a machine learning model by arbitrarily modifying pixels within a restricted region of an input image. Such attacks are a major threat to models deployed in the physical world, as they can be easily realized by presenting a customized object in the camera view. Defending against them is challenging due to the arbitrariness of patches, and existing provable defenses suffer from poor certified accuracy. In this paper, we propose PatchVeto, a zero-shot certified defense against adversarial patches based on Vision Transformer (ViT) models. Rather than training a robust model to resist adversarial patches, which may inevitably sacrifice accuracy, PatchVeto reuses a pretrained ViT model without any additional training; it achieves high accuracy on clean inputs while detecting adversarially patched inputs simply by manipulating the attention map of the ViT. Specifically, each input is tested by voting over multiple inferences with different attention masks, where at least one inference is guaranteed to exclude the adversarial patch. The prediction is certifiably robust if all masked inferences reach consensus, which ensures that any adversarial patch is detected with no false negatives. Extensive experiments show that PatchVeto achieves high certified accuracy (e.g., 67.1% on ImageNet for 2%-pixel adversarial patches), significantly outperforming state-of-the-art methods. The clean accuracy is the same as that of vanilla ViT models (81.8% on ImageNet) since the model parameters are directly reused. Meanwhile, our method can flexibly handle different adversarial patch sizes by simply changing the masking strategy. Comment: 12 pages, 5 figures
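To make the certification rule concrete, here is a minimal, hypothetical sketch of the consensus-voting idea described above. The mask generation and ViT inference are omitted; `certify_by_consensus` and its inputs are illustrative names, not the authors' code:

```python
def certify_by_consensus(masked_predictions):
    """Return (label, certified). The prediction is certified only when
    every masked inference votes for the same class, so an adversarial
    patch excluded by at least one mask cannot flip the result undetected."""
    majority = max(set(masked_predictions), key=masked_predictions.count)
    certified = all(p == majority for p in masked_predictions)
    return majority, certified

# Consensus across all masks -> certifiably robust prediction.
print(certify_by_consensus(["cat", "cat", "cat"]))  # ('cat', True)
# Disagreement -> a patch may be present; flagged, not certified.
print(certify_by_consensus(["cat", "dog", "cat"]))  # ('cat', False)
```

With this rule, a false negative would require the patch to fool every masked inference, including the one guaranteed to exclude it.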

    Microbial Communities in Water during Red Tides along the Coast of China: A Case Study of a Prorocentrum donghaiense Red Tide in the East China Sea

    Red tides are a major public hazard in the global oceans. The coast of the East China Sea is where red tide disasters are most frequent and serious in China. In order to accurately grasp the occurrence of red tides in the coastal waters of the East China Sea, and to understand the microbial communities in these waters during red tide events, a dedicated survey was carried out in the coastal waters of Zhejiang, China in June 2018. The results showed that nutrient concentrations of N and P were generally high in this area; DIN concentrations in most areas exceeded the permitted limit of Chinese seawater quality Grade I. There were significant differences in dissolved oxygen, pH, COD, chlorophyll, and phytoplankton abundance between the red tide waters and the surrounding waters. During the investigation, red tides were found in the waters near the Yushan Islands. The chlorophyll a content was 42.12 mg/m³, the phytoplankton cell abundance was 8.16×10⁸ cells/L, and Prorocentrum donghaiense accounted for 98.5% of that abundance. The Illumina MiSeq sequencing platform was used for 16S rRNA high-throughput sequencing of the water microorganisms, and a total of 16 bacterial phyla were identified. Proteobacteria was the first dominant phylum, followed by Cyanobacteria and Bacteroidetes. Some differences in bacterial community composition between the harmful algal bloom (HAB) area and the nearby seawater were observed. The predominant bacteria in the red tide occurrence area were Proteobacteria, comprising 46.1% of the relative abundance, while the predominant bacteria in the nearby sea area comprised 42.0% of the relative abundance.

    TSTTC: A Large-Scale Dataset for Time-to-Contact Estimation in Driving Scenarios

    Time-to-Contact (TTC) estimation is a critical task for assessing collision risk and is widely used in various driver-assistance and autonomous driving systems. The past few decades have witnessed the development of related theories and algorithms. The prevalent learning-based methods call for a large-scale TTC dataset collected in real-world scenarios. In this work, we present a large-scale, object-oriented TTC dataset of driving scenes to promote TTC estimation with a monocular camera. To collect valuable samples and keep data with different TTC values relatively balanced, we went through thousands of hours of driving data and selected over 200K sequences following a preset data distribution. To augment the number of small-TTC cases, we also generate clips using recent neural rendering methods. Additionally, we provide several simple yet effective TTC estimation baselines and evaluate them extensively on the proposed dataset to demonstrate their effectiveness. The proposed dataset is publicly available at https://open-dataset.tusen.ai/TSTTC. Comment: 19 pages, 9 figures
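As background for monocular TTC estimation, the classic approximation infers TTC from the growth rate of an object's image size: if a bounding box scales by s = w_curr / w_prev over an interval dt at constant closing speed, then TTC ≈ dt / (s − 1). A minimal illustrative sketch (`ttc_from_scale` is an assumed name, not necessarily one of the paper's baselines):

```python
def ttc_from_scale(w_prev, w_curr, dt):
    """Estimate time-to-contact from the growth of an object's image width,
    assuming constant closing speed: TTC ~ dt / (s - 1), s = w_curr / w_prev."""
    s = w_curr / w_prev
    if s <= 1.0:
        return float("inf")  # object not approaching (shrinking or static)
    return dt / (s - 1.0)

# Box width grows from 100 px to 105 px over 0.1 s -> TTC of about 2 s.
print(round(ttc_from_scale(100.0, 105.0, 0.1), 2))
```

This also illustrates why small-TTC cases are rare and valuable: they require large frame-to-frame scale changes, which the paper augments via neural rendering.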

    Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models

    The recent performance leap of Large Language Models (LLMs) opens up new opportunities across numerous industrial applications and domains. However, erroneous generations, such as false predictions, misinformation, and hallucinations produced by LLMs, have also raised severe concerns about the trustworthiness of LLMs, especially in safety-, security-, and reliability-sensitive scenarios, potentially hindering real-world adoption. While uncertainty estimation has shown its potential for interpreting the prediction risks of general machine learning (ML) models, little is known about whether, and to what extent, it can help explore an LLM's capabilities and counteract its undesired behavior. To bridge this gap, in this paper we initiate an exploratory study on the risk assessment of LLMs from the lens of uncertainty. In particular, we experiment with twelve uncertainty estimation methods and four LLMs on four prominent natural language processing (NLP) tasks to investigate to what extent uncertainty estimation techniques can help characterize the prediction risks of LLMs. Our findings validate the effectiveness of uncertainty estimation for revealing LLMs' uncertain or non-factual predictions. Beyond general NLP tasks, we also conduct extensive experiments with four LLMs on code generation over two datasets. We find that uncertainty estimation can potentially uncover buggy programs generated by LLMs. Insights from our study shed light on the future design and development of reliable LLMs, facilitating further research toward enhancing their trustworthiness. Comment: 20 pages, 4 figures
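One common family of uncertainty measures, sample-based predictive entropy over repeated generations, can be sketched as follows. This is an illustrative instance of the general technique, not a reproduction of any of the paper's twelve methods:

```python
import math
from collections import Counter

def predictive_entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of answers
    produced by repeated sampling; higher entropy = less confident model."""
    counts = Counter(samples)
    n = len(samples)
    if len(counts) == 1:
        return 0.0  # all samples agree: zero uncertainty
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# The model always gives the same answer -> zero uncertainty.
print(predictive_entropy(["Paris"] * 4))          # 0.0
# The model flip-flops between two answers -> maximal binary entropy.
print(predictive_entropy(["Paris", "Lyon"] * 2))  # 1.0
```

High-entropy inputs are candidates for the uncertain or non-factual predictions the study aims to flag.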

    Question Decomposition Tree for Answering Complex Questions over Knowledge Bases

    Knowledge base question answering (KBQA) has attracted a lot of interest in recent years, especially for complex questions that require multiple facts to answer. Question decomposition is a promising way to answer such questions. Existing decomposition methods split a question into sub-questions according to a single compositionality type, which is not sufficient for questions involving multiple compositionality types. In this paper, we propose the Question Decomposition Tree (QDT) to represent the structure of complex questions. Inspired by recent advances in natural language generation (NLG), we present a two-staged method called Clue-Decipher to generate QDTs. It leverages the strong generation ability of NLG models while preserving the original questions. To verify that QDT can enhance the KBQA task, we design a decomposition-based KBQA system called QDTQA. Extensive experiments show that QDTQA outperforms previous state-of-the-art methods on the ComplexWebQuestions dataset. Besides, our decomposition method improves an existing KBQA system by 12% and sets a new state of the art on LC-QuAD 1.0. Comment: Accepted by AAAI202
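The tree representation itself can be illustrated with a minimal sketch. This is a hypothetical data structure, not the paper's implementation: inner nodes carry a compositionality type, and leaves hold atomic sub-questions:

```python
class QDTNode:
    """One node of a question-decomposition tree (illustrative sketch)."""
    def __init__(self, label, children=None):
        self.label = label               # composition type, or a sub-question
        self.children = children or []   # empty list for leaf sub-questions

    def leaves(self):
        """Collect atomic sub-questions left-to-right for downstream KBQA."""
        if not self.children:
            return [self.label]
        return [q for c in self.children for q in c.leaves()]

# A question mixing two compositionality types (conjunction + superlative).
tree = QDTNode("conjunction", [
    QDTNode("Who directed Inception?"),
    QDTNode("superlative", [QDTNode("Which of their films earned most?")]),
])
print(tree.leaves())
```

Because each inner node has its own type, one tree can mix compositionality types, which a single flat split cannot express.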

    Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding

    Recently, large language models (LLMs) have made significant advancements in natural language understanding and generation. However, their potential in computer vision remains largely unexplored. In this paper, we introduce a new, exploratory approach that enables LLMs to process images using the Scalable Vector Graphics (SVG) format. By leveraging the XML-based textual descriptions of SVG representations instead of raster images, we aim to bridge the gap between the visual and textual modalities, allowing LLMs to directly understand and manipulate images without the need for parameterized visual components. Our method facilitates simple image classification, generation, and in-context learning using only LLM capabilities. We demonstrate the promise of our approach across discriminative and generative tasks, highlighting its (i) robustness against distribution shift, (ii) substantial improvements achieved by tapping into the in-context learning abilities of LLMs, and (iii) image understanding and generation capabilities with human guidance. Our code, data, and models are available at https://github.com/mu-cai/svg-llm.
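The core observation, that an SVG image is just XML text an LLM can read, can be sketched as follows. The shapes, helper name, and prompt are illustrative assumptions, not the paper's pipeline:

```python
def shapes_to_svg(shapes, size=32):
    """Serialize (cx, cy, r) circles into an SVG document string; because
    the result is plain XML text, it can be placed directly in an LLM prompt."""
    elems = "".join(
        f'<circle cx="{x}" cy="{y}" r="{r}"/>' for (x, y, r) in shapes
    )
    return f'<svg width="{size}" height="{size}">{elems}</svg>'

svg_text = shapes_to_svg([(8, 8, 4), (24, 24, 4)])
# The textual image can now be embedded in an ordinary text prompt.
prompt = f"Classify the object drawn by this SVG:\n{svg_text}"
print(svg_text)
```

No vision encoder is involved: classification, editing, and in-context examples all operate on the SVG string itself.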

    Molecular state interpretation of charmed baryons in the quark model

    Stimulated by the observation of $\Lambda_c(2910)^+$ by the Belle Collaboration, the $S$-wave $qqq\bar{q}c$ ($q=u$ or $d$) pentaquark systems with $I=0$ and $J^P=\frac{1}{2}^-$, $\frac{3}{2}^-$, and $\frac{5}{2}^-$ are investigated in the framework of the quark delocalization color screening model (QDCSM). The real-scaling method is utilized to identify bound states and genuine resonance states. The root mean square of the cluster spacing is also calculated to study the structure of the states and to estimate whether a state is a resonance. The numerical results show that $\Lambda_c(2910)$ cannot be interpreted as a molecular state, and $\Sigma_c(2800)$ cannot be explained as the $ND$ molecular state with $J^P=\frac{1}{2}^-$. $\Lambda_c(2595)$ can be interpreted as a molecular state with $J^P=\frac{1}{2}^-$ whose main component is $\Sigma_c\pi$. $\Lambda_c(2625)$ can be interpreted as a molecular state with $J^P=\frac{3}{2}^-$ whose main component is $\Sigma_c^{*}\pi$. $\Lambda_c(2940)$ is likely a molecular state with $J^P=\frac{3}{2}^-$ whose main component is $ND^{*}$. Besides, two new molecular states are predicted: a $J^P=\frac{3}{2}^-$ $\Sigma_c\rho$ resonance state with a mass around 3140 MeV, and a $J^P=\frac{5}{2}^-$ $\Sigma_c^{*}\rho$ state with a mass of 3188.3 MeV. Comment: 12 pages, 3 figures

    LUNA: A Model-Based Universal Analysis Framework for Large Language Models

    Over the past decade, Artificial Intelligence (AI) has achieved great success and is being used in a wide range of academic and industrial fields. More recently, LLMs have made rapid advancements that have propelled AI to a new level, enabling even more diverse applications and industrial domains, particularly in areas like software engineering and natural language processing. Nevertheless, a number of emerging trustworthiness concerns and issues exhibited by LLMs have received much attention; without proper solutions, the widespread adoption of LLMs in practice could be greatly hindered. The distinctive characteristics of LLMs, such as the self-attention mechanism, the extremely large model scale, and the autoregressive generation scheme, differ from classic AI software based on CNNs and RNNs and present new challenges for quality analysis. To date, universal and systematic analysis techniques for LLMs are still lacking despite the urgent industrial demand. Toward bridging this gap, we initiate an early exploratory study and propose LUNA, a universal analysis framework for LLMs, designed to be general and extensible, enabling versatile analysis of LLMs from multiple quality perspectives in a human-interpretable manner. In particular, we first leverage data from the desired trustworthiness perspective to construct an abstract model as an auxiliary analysis asset, supported by various abstract model construction methods. To assess the quality of the abstract model, we collect and define a number of evaluation metrics at both the abstract-model level and the semantics level. The semantics, i.e., the degree to which the LLM satisfies the trustworthiness perspective, is then bound to the abstract model, enriching it and enabling more detailed analysis applications for diverse purposes. Comment: 44 pages, 9 figures
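The abstract-model construction step can be illustrated with a heavily simplified sketch: map scalar summaries of concrete model states onto a few abstract states (here by uniform binning, one of many possible abstraction methods) and count transitions between them along a trace. All names and the binning scheme are assumptions for illustration, not LUNA's implementation:

```python
from collections import defaultdict

def abstract_state(value, n_bins=3, lo=0.0, hi=1.0):
    """Bin a scalar summary of a hidden state into an abstract state id."""
    b = int((value - lo) / (hi - lo) * n_bins)
    return min(max(b, 0), n_bins - 1)  # clamp edge values into valid bins

def transition_counts(trace, n_bins=3):
    """Count abstract-state transitions along one generation trace; normalized
    counts would give a discrete-time Markov chain over abstract states."""
    states = [abstract_state(v, n_bins) for v in trace]
    counts = defaultdict(int)
    for a, b in zip(states, states[1:]):
        counts[(a, b)] += 1
    return dict(counts)

print(transition_counts([0.1, 0.2, 0.5, 0.9]))
```

Semantics from a trustworthiness perspective (e.g., a per-step factuality score) could then be attached to each abstract state to support the kind of detailed analysis the abstract describes.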

    Bridging the Gap between Chemical Reaction Pretraining and Conditional Molecule Generation with a Unified Model

    Chemical reactions are the fundamental building blocks of drug design and organic chemistry research. In recent years, there has been a growing need for a large-scale deep-learning framework that can efficiently capture the basic rules of chemical reactions. In this paper, we propose a unified framework that addresses both reaction representation learning and molecule generation tasks, which allows for a more holistic approach. Inspired by organic chemistry mechanisms, we develop a novel pretraining framework that enables us to incorporate inductive biases into the model. Our framework achieves state-of-the-art results on challenging downstream tasks. By possessing chemical knowledge, our generative framework overcomes the limitations of current molecule generation models that rely on a small number of reaction templates. In extensive experiments, our model generates synthesizable, drug-like structures of high quality. Overall, our work presents a significant step toward a large-scale deep-learning framework for a variety of reaction-based applications.